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ACCURATE AND EFFICIENT COMPUTATION OF NONLOCAL 
POTENTIALS BASED ON GAUSSIAN-SUM APPROXIMATION 

LUKAS EXL*, NORBERT J. MAUSERt, AND YONG ZHANG t 

Abstract. We introduce an accurate and efficient method for a class of nonlocal potential 
evaluations with free boundary condition, including the 3D/2D Coulomb, 2D Poisson and 3D dipolar 
potentials. Our method is based on a Gaussian-sum approximation of the singular convolution kernel 
and Taylor expansion of the density. Starting from the convolution formulation, for smooth and fast 
decaying densities, we make a full use of the Fourier pseudospectral (plane wave) approximation of the 
density and a separable Gaussian-sum approximation of the kernel in an interval where the singularity 
(the origin) is excluded. Hence, the potential is separated into a regular integral and a near-field 
singular correction integral, where the first integral is computed with the Fourier pseudospectral 
method and the latter singular one can be well resolved utilizing a low-order Taylor expansion of the 
density. Both evaluations can be accelerated by fast Fourier transforms (FFT). The new method is 
accurate (14-16 digits), efficient (0(N log N) complexity), low in storage, easily adaptable to other 
different kernels, applicable for anisotropic densities and highly parallelable. 

Key words, nonlocal potential solver, free boundary condition, separable Gaussian-sum ap¬ 
proximation, Coulomb/Poisson/dipolar potential, singular correction integral 
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1. Introduction. In this paper, we aim to evaluate nonlocal potentials given 
originally by convolutions: 


i(x) = (U* p)(x) = f U(x - y)p{y)dy, 
J R d 


X £ 


d = 2,3, 


( 1 . 1 ) 


where * denotes the convolution operator, p(x) is the density function, and U (x) is 
a nonlocal (long-range) convolution kernel. Here we are interested in the cases where 
the density is smooth and decays fast. 

Nonlocal potentials exist in a variety of mathematical models from quantum 
physics / chemistry to material sciences, plasma physics and computational biology 
etc. The computation of the nonlocal potential is often the most time-consuming 
part in simulations, and the development of accurate and efficient numerical schemes 
remains an active and important topic in the science and engineering community. 
The 3D Coulomb potential (also called ’’Newtonian potential”), with t/(x) = 4 ^ x | , 
is fundamental and universal in many applications, such as Bose-Einstein Conden¬ 
sates ME 36,371 and quantum chemistry [21 23}|28]. Here, we study a class of 
nonlocal potentials with their kernels given explicitly as follows: 


U(x) = 


1 

47t|x| ’ 

3D Coulomb, 

1 

27t|x| ’ 

2D Coulomb, 

-^F ln l x l> 

2D Poisson. 


( 1 . 2 ) 


Of course, there are other nonlocal potentials and some of them can be re¬ 
formulated through the above kernels, e.g., the dipolar convolution kernel C/(x) = 
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3 m-n—3(x-n)(m-x)/|x|~ 
4-7T |x| 3 


where n, m £ 


are unit vectors 


ourselves to those given in (1.2) because they are both common and important. 



We restrict 


The convolution (1.1) can be represented formally as a Fourier integral 
1 


u(x) = 


(27r) a 


(J(k)p(k) e* k x dk 


x e . 


(1.3) 


where /(k) = f Rd /(x) e ikx dx is the Fourier transform of /(x) for x, k £ R d . 
Note that the Fourier transform of the convolution kernel U(k) is also long-range and 
singular, and sometimes the singularity is too strong that the Fourier representation 
is not well-defined, e.g., 1/1k| 2 for the 2D Poisson potential :6j. Another important 
equivalent formulation is to solve a partial differential equation in the whole space 
with appropriate far-held condition. For example, the 3D Coulomb potential satisfies 
the following Poisson equation, i.e., 


- Au(x) = jo(x), 


X £ . 


M 


lim u(x) = 0. 


(1.4) 


All of the three formulations are challenging numerically. In 0 and ( |1.3| ), the 
convolution kernels and their Fourier transforms are long-range and singular at the 
origin and/or at the far-held. Therefore, accurate and efficient evaluation requires 
either a large computational domain and/or elaborate strategies to take care of the 
singularity in physical and phase space, respectively. 

Various numerical methods have been proposed to solve the potential via the 
PDE approach on a rectangular domain with uniform mesh grid 14,5 38 . Take the 


3D/2D Coulomb potential as an example. As the potential decays to zero at the far- 
held, the commonly used periodic and homogeneous boundary conditions, imposed 
on the boundary of the rectangular domain, do not agree very well with the far-held 
asymptotics. Errors coming from the boundary condition approximation dominates 
as the mesh size tends smaller. The saturated accuracy achieved by Fourier/Sine 
pseudospectral methods, also referred to as “locking” accuracy, improves when the 
domain size increases 3, [4], 6, 7. While for the 2D Poisson potential, periodic or 
homogeneous boundary condition approximation is totally inappropriate, because the 
potential diverges, i.e., u(x ) —> Cln |x|, C > 0 as x —> oo. For this case, exact artihcial 
boundary conditions on a disk were given by Zhang and Mauser 29 , and boundary 


conditions on the rectangular domain remain to be further explored. However, not all 
interesting potentials can be formulated via PDE [5j[6|, therefore, we start from the 
convolution or the Fourier integral. 

Starting from the Fourier integral 0, simple plane-wave discretizations suffer 


serious accuracy loss due to improper treatment of the singularity in U 6.7 


21 


. For 


kernels with removable singularity in spherical/polar coordinates, e.g., the 3D/2D 
Coulomb potential, Jiang, Greengard and Bao (26 proposed an accurate and efficient 
method by splitting the kernel into long-range regular and short-range singular part, 
and evaluating the quadrature via Fast Fourier Transform (FFT) and the nonuniform 
FFT (NUFFT) [l5j, respectively. This approach was recently adapted to the 2D 
Poisson potential case |6 , whose singularity is too strong to be cancelled out in polar 
coordinates. Their method can achieve spectral accuracy with great efficiency that is 
inherited from the FFT and NUFFT algorithm. However, it is not ideal because of 
the large prefactor in front of the O(NlnN) coming from the NUFFT 15 26 . More 


importantly, the 3D Coulomb/dipolar evaluation is rather slow and needs further 
investigation for potential simulations. 
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It is more natural to start from the convolution form (1.1). In fact, there has been 
lot of work on this problem, see [8|[T4|[21-23 . A basic idea is to modify the kernel 
somehow and to evaluate the long-range interaction efficiently. Several methods have 
been proposed, such as the Ewald-type partition 1 28 , kernel-truncation 14 , Gaussian- 


sum (GS) approximation ; 8 22 23 etc. Among these approaches, the Gaussian-sum 
based method is one of the most effective and accurate solvers. The Gaussian-sum 
approximation has been studied intensively, we refer the readers to MlH- 
In j8], Beylkin et al. split the 2D/3D Helmholtz potential into a convolution with 
a Gaussian-sum in the spatial domain and band-limited multiplier in the Fourier 
domain. Later, Genovese et al. [22] solved the Poisson potential by combining inter¬ 
polating scaling function (ISF) representation of the density and the Gaussian-sum 
approximation. The resulting discrete convolution was accelerated by FFT and it is 
ideal for parallelization. However, the optimal accuracy, around 10-digits, is limited 
by the resolution of the kernel’s GS approximation and also the neglected near-field 
correction integral. 


Here we aim to design a method to combine the advantages of the NUFFT and the 
ISF Poisson solver. To this end, we shall adopt the Gaussian-sum approximation for 
the regular long-range regular integral and compute the near-field correction integral 


with local interpolations instead of global spectral interpolation, as in 26 


We first use a separable approximation to split the potential into a long-range 
regular and a short-range singular integral. The separable approximation is chosen as 
a Gaussian-sum and computed by sine quadrature [33] within a very high-resolution, 
i.e., about 10 -16 in relative norm, over an interval [<5, 2] with a relative large <5 = 
10 -3 or 10 -4 . This grants us the possibility to restrict the local correction in the 
interval [0,5]. We mention that this approach has already been proven effective by Exl 
and Schrefl [l9] in the context of computational micromagnetics. Plugging the finite 
Fourier series expansion into the Gaussian convolution, the evaluation boils down 
to four Fourier transforms (forward and backward pair counted as two). The sine 
approach gives us a suitable, fast and easily adaptable way to obtain a Gaussian-sum 
approximation on a given interval within a prescribed accuracy. However, we shall 
remark that the sine approach does not lead to an optimal approximation in terms of a 
minimal amount of Gaussian terms. Best approximation of singular kernels by means 


of numerical optimization was considered and successfully applied in 13 14 19 25 


In the computation of the regular integral, the tensor product structure of the 
Gaussian-sum approximation is exploited for accurate and stable pre-computation 
of the coefficients, which are given by higher-dimensional integrals. In practice, the 
potential has to be solved many times, so it is worthwhile storing the precomputed 
coefficients so as to save storage and CPU-time. The near-field correction integration 
over the small ball Bs := {x G R rf ||x| < d} is computed based on a low-order Taylor 
expansion of the density. Similarly, the derivatives involved are also computed via 
FFT, therefore the near-field computation is also suitable for parallelization. We vali¬ 
date our approach for different types of nonlocal potentials, i.e., the 3D/2D Coulomb 
potential, the 2D Poisson potential and the 3D dipolar potential. Furthermore, an 
adaption to Davey-Stewartson nonlocal potential 34 is presented. 


The paper is organized as follows. In Section 2, we describe the algorithm, which 
consists of three steps: reformulation, long-range regular integral evaluation, short- 
range singular integral evaluation, followed by extensions to 3D dipolar potential. 
Detailed error analysis is also given. In Section 3, we present details on the Gaussian- 
sum approximation of two kernels, i.e., the Coulomb kernel 1/r and 2D Poisson kernel 
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In r, by sine quadrature. In Section 4, extensive numerical results are given to illustrate 
its performance in both accuracy and efficiency. Finally, some concluding remarks are 
drawn in Section 5. 

2. Numerical algorithm. For the sake of consistency, we first rescale the prob¬ 


lem and introduce the reformulation of (1.1). The key formulation is given in (2.5)- 


(2.6). The subsequent subsections explain the computation of the regular integral and 


the correction integral, as well as extension to 3D dipolar potential and application 
to 2D/3D Coulomb potential with anisotropic densities. 

2.1. Preliminary discussion. As assumed in the beginning that the density is 
smooth and fast decaying, we can reasonably assume that the density is compactly 
supported in a square box Bl := \—L, L] d C R d , d = 2, 3 with a controllable precision. 
For the sake of simplicity, we choose a square box B l, and it will be shown in the 
forthcoming subsection that a general rectangular box is also feasible. The domain 


is also the domain of interest for the nonlocal potential u fll.lto 


Following a standard scaling argument, we first rescale the density to be compactly 
supported in the unit box Bi, i.e., 


x = x L, p(x) = p(x), ==> x g Bi, supp(p) C Bi. 

Plugging (2.1) into the convolution (|1.1|), we have 


( 2 . 1 ) 


u(x) = / U(x-y)p(y)dy = / U(x- y)p(y)dy = L / U(x - y)p(y)dy. (2.2) 
JR d Jb l J Bj 

Particularly, for the 2D/3D Coulomb potentials, we have E/(x) = t/(x) = U(ScL) = 
L _ 1 t/(x), therefore, 


u(x) = u(x) = L‘ 


i -1 


U{x-y)p(y)dy , 


xgBi, d = 2,3. 


(2.3) 


Similarly, for the 2D Poisson potential, we have 


t(x) = u(x) = j B p{y ) ln 


|x-y|<iy-T 


In l( p(y) dy, xgBi.(2.4) 
J Bi 


Notice that the domain of interest is also rescaled to the unit box Bi, therefore, the 
evaluation of u(x) on B/, is equivalent to computing u(x) on the unit box with rescaled 
density jo(x), which is also compactly supported in Bi. We shall omit ~hereafter for 
simplicity. In practice, the computation domain Bi is usually discretized uniformly 
in each direction, and the density is given on the uniform grids 7h- 


T h = {(xf i ) ,...,x < f d > )\ x< f' > = -1 +j p h( p \h {p) = 2/rip, 1 < j p < n p , p = 1 ,..., d}. 


As discussed earlier, one of the key ideas is to use a separable GS approximation 
of the convolution, to be elaborated in Sec. |3.2| and |3.3| to reformulate the potential 
into two integrals, namely, the long-range regular integral and short-range singular 
integral. To be precise, we can reformulate the potential as follows: 

u(x) = [ U G s(y) p(x-y)dy+ [ (U{y) - U GS (y)) p(x - y)dy (2.5) 

J s. d J«. d 


:= /i(x) + / 2 (x), 


x g Bi, 


( 2 . 6 ) 
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where Uqs-i the GS approximation of the kernel, is given explicitly as follows: 


S 


Ugs( y) = f/Gs(|y|) := ^ 

9=0 


s" T « |y|2 , SeN + , 


(2.7) 


with weights and nodes {{w q , r q )}^ =0l /i(x) and / 2 (x) are the long-range regular inte¬ 
gral and short-range singular integral (also named as correction integral ), respectively. 
It is noteworthy to point out that the singularity of the integrand of / 2 at the ori¬ 
gin in physical space is cancelled out in spherical coordinates by the Jacobian of the 
coordinates transform. 

Another important feature is our high-resolution GS approximation of the singular 
kernel U over a interval excluding the origin r = 0. In fact, with sine quadrature 
17,33 , the GS approximation error e, measured in relative/absolute maximum norm, 


over the interval [<5, 2] can be achieved up to machine precision. In our algorithm, the 
parameter 6 does not have to be chosen as small as in [ 22 | and we can choose some 
intermediate value, e.g., 10 - 4 ,10~ 3 . With accurate GS approximation, the integral 
domain of the correction integral can be compressed to a small neighbourhood of the 
origin, i.e., Eg. Thus the correction integral evaluation can be done with some Taylor 
expansion. Details are to be presented in subsection |2.3[ Detailed description of the 
evaluation of I± and I 2 is to be shown in subsection |2.2| and |2.3[ respectively. 

2.2. Evaluation of the regular integral I i(x). Due to the compact support 
assumption of the density, plugging the explicit GS approximation (2.7) into I i(x), 
we have 


( 2 . 8 ) 

(2.9) 

( 2 . 10 ) 


r S 

/ w q e -T 9lyl“p( x — y)dy, 

J ^ d 9=0 

x £ Bi 

s f 

^w q \ w T2 lyl 2 p (x-y)dy, 

g =0 Jb x,i 

x £ Bx 

'S' /* 22 

w i e~ r2 lyl 2 p (x-y)dy, 

9=0 

x £ Bx 


where B x 1 := Bi + x is the unit box centred at x. Identity (2.10) holds because 
p(x — y) = 0, V x € Bi, y £ B 2 \ B x 3 . For x € Bi and y £ B 2 holds x — y £ B 3 , 
and we can approximate the density on B 3 by Fourier pseudo-spectral method with 
To be more specific, a simple zero-padding of the density from 


spectral accuracy 35 


B x to B 3 is applied first, and the padded density p is well resolved by the following 
finite Fourier series: 


P( z ) 


k j =1 


n 


— a 


e 0 


z = (*«..., zW)£B 3 , (2.11) 


where aj = —3, bj = 3, j = 1... d and k = (ki, ..., kd) £ with kj = — nj/2, ..., nj/2— 
1 and nj = 3 nj . The Fourier coefficients are determined as follows: 


Pk = 


B 3 


p(z) F[e" 

J=i 




dz. 


( 2 . 12 ) 
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where |B 3 i = n; =1 (^ — cij) is the volume. The above integral is then approximated 
by a trapezoidal rule, which can help achieve spectral accuracy, and the summation 
is accelerated by Fast Fourier Transform (FFT) [32] . 

Plugging (2.11) into (2.10), we have 


s r 

U(x) = / e - Ulyl'p( x -y)dy 

q —0 1 


(2.13) 

S d 2 7ri kj 

f d —27 ri fc 7 - 

j e~ T i l y| W e b s~ a i dy 

(2.14) 

9=0 k j -1 

j=1 


S d o iri h ■ / 

v—- v—- -»—r 3 n ■ 

I s - 

-*5 

to 

O 

1 

to 


= Y, w ^Yjp^Ii e bj ~ aj 

'11/ e~ T i [y '" 1 e 6 i-“i dyU> 

(2.15) 

•£5 

II 

O 

9? 

1^ 

3=1 ~ 2 


k \q=0 / j=l 

—dj) 

1 

(2.16) 


where 


G 2 = II 


d r 2 -2 w W) 

-r 2 M j) \ 2 ~ 


e « 1 y 1 e 3 j 




i=i 


'—2 


II / 2e T92|y<J>|2 COS ( 2 hj J -a- J) ^ y(3) - 


(2.17) 


1=1 ^ 


The coefficients in (2.17) are tensor products for any fixed index q. Notice that G^. 
does not depend on the mesh size h := (h 1 ,..., h d ) T or the density p. Therefore, it can 
be pre-computed, which greatly enhances the efficiency of the evaluation of 7i, because 
the potential is usually solved many times in simulations. For a given discretization, 
we can pre-compute and store the sums of coefficients, i.e., '^2q =0 w q G^, which helps 
decrease the CPU-time dramatically at a small expense of storage. To compute G£, 
we only need to calculate three 1 -dimensional vectors whose components are given as 


integrals. The integrals in (2.17) can be evaluated numerically by a Gauss-Kronrod 
quadrature up to machine precision 30 . Once Y2q=o w q^ia known, Ii(x) can be 


computed by (2.16) and the summation can be accelerated by FFT. 

Remark 2.1. Actually, we can restrict the zero-padding to B 2 and apply the 
Fourier series approximation (2.11) on B 2 instead of B 3 . This can be inferred from 


the fact that the 4- periodic and the 6 - periodic extension of the density coincide on B 2 , 
cf. Fig. 2.1. Correspondingly, the constants in (2.11) will be changed to aj = —2,bj = 


2 ,rij = 2 n.j,j = 1 ,,d. 

2.3. Evaluation of the correction integral / 2 (x). To evaluate / 2 (x), we first 
split it into two integrals as 

7 2 (x) = [ (U(y) — Ugs(y)) p( x — y)dy, x e Bi (2.18) 

J R d 

= {[ + / J ( [/ (y)- c/ Gs(y))p(x-y)dy, xeBi (2.19) 

\Jb s Js. d \B s J 

:= 7 2 ,i(x) + I 2 , 2 (x), xeBi. (2.20) 
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-6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 

Fig. 2.1. The 4-periodic (green-solid) and the 6 -periodic (black-dotted) extension coincide on B 3 . 


As can be inferred from the compactness assumption, i.e., supp(p) C Bi, we 
have for x £ Bi that supp{p(x — y)} C B 2 . Therefore, the latter integral / 2 , 2 (x) is 
equivalent to an integral defined on a bounded domain, i.e., V := B 2 \B$. To be more 
precise, we have 


-f2,2(x) = [ (U(y) — U G s(y)) p( x — y)dy, x e Bi (2.21) 

JR d \B s 

= ( (U(y)-U G s(y))p(x-y)dy, xeBj. (2.22) 

Jv 

Since the GS approximation of U gives an error e on 2b, by adopting in spherical/polar 
coordinates, we have 


|/ 2 , 2 (x)| < |S d_1 | ML f 2 r d_1 \U(r)-U GS (r)\dr 
Js 

< C HpIIoo max | (U(r) - U GS {r)) |, 
r£[<5,2J 


(2.23) 

(2.24) 


where jS 11 


d— 11 


2ir d ' 2 

T (d/2) 


is the volume of unit surface in and C is a constant not 


depending on the density. Then we neglect I 2 ,2 because of the near-machine precision 
accurate GS approximation in (2.23). We refer to Sec.[3]and Fig. 3.1 for more details. 


Remark 2.2. In (2.231, the error estimate does not really have to depend on the 


density p, simply because we can normalize the density to be 


= 1. 


In order to compute J 2j i, we need first to interpolate the density function in a 
^-neighborhood of x. Since <5 is small (we choose S = 10 -3 or ICR 4 in our implemen¬ 
tation), the interpolation of the density p x ( y) := p(x — y) within Bg can be done by 
the Taylor expansion. To be exact, we have 

Px(y) = P x (y) + Rx(y), y e B s , (2.25) 

where P x (y), the third order Taylor expansion, is defined as follows: 


P*(y) = ft,(o)+ £%)%, 


2=1 


9yj 


1 <9 2 p x ( 0) 

2 dy 0 dy k 


VjVk 



j,k,l=1 


d 3 p x (0) 

d yj dy k d yi Vjykyi 


(2.26) 


and the remainder R x (y) = C(p,x)|y| 4 with the constant C(p,x) depending on the 
density p and x. Spherical coordinates are now used in order to remove the singularity. 
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Next, we plug the spherical representation of (2.261 into (2.20). After integration over 


the r,9,(j) respectively, the evaluation of / 2 ,i conies down to simple multiplication of 
A p and some constants, since the contributions of the odd derivatives in (2.26) and 


off-diagonal components of the Hessian vanish. It is noteworthy to point out that we 
do not have to resort to any numerical quadrature here. 

The approximation error of 1-2a by I 2.1 (the integral with Taylor expansion) is 
estimated as follows: 


K/ 2,1 --? 2 ,i)(x)| = / (U(y) - U G s(y))C(p,x)\y\ 4 dy 


IBs 


< ||C(p,x)|| 00 |S' 


d—1 I 


r d 1 r 4 |C/(r) — I/Gs(r)|dr 


(2.27) 

(2.28) 


slICMius-icM <“»> 


where 


h,i(-x) = (t/(y)- u G s{y))P x (y)dy, 

Jb s 


(2.30) 


and Cs is a positive parameter depending on the weights of the Gaussian-sum ap¬ 
proximation (cf. Sec. [3]). 


Remark 2.3. The main computational work relies on the FFT. Since there are 
several successful versions of distributed-memory parallel FFT implementations, e.g., 
parallel version of the FFTW and its extensions J2£, 31l, the performance of our 
method can be enhanced with such libraries. 


2.4. Extension to the dipolar potential. The dipolar potential is of great 
importance in condensed matter and quantum mechanics 
volution form, i.e., u(x) = U * p where 


36 37 . It also takes con- 


t/(x) = 


3 m • n — 3(x • n)(m • x)/|x( 
47 T Ixl 3 


(2.31) 


= —(m • n)<5(x) — 3<9 n 


4-7t|x| 


x G 


Using the convolution theorem, we can rewrite the dipolar potential as follows: 


t(x) = -(m • n)p(x) + d n 


f 1 


\ 47 r|x| 


P = — (m • n)p(x) + 


1 


47t|x 


* ( d nm p ). (2.32) 


Therefore, the computation of u consists of the evaluation of the 3D Coulomb poten¬ 
tial with the source term d nm p(x). Since the density p(x) is smooth and compactly 
supported in B^, it can be approximated by finite Fourier series with spectral accu¬ 
racy, and so does the second derivative d nm p{x). The source term <9 nm p(x) can be 
easily computed with arithmetic operations of the discrete Fourier coefficients. 

We note that a similar situation arises in the Davey-Stewartson nonlocal potential 
in 2D, where one could solve Poisson’s equation with the second derivative of the 
density as source term. Hence, in the convolution form, one only has to convolve the 
2D Poisson kernel with the second order derivative of the density, cf. example [5] in 
Sec. gj 
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For readers’ convenience, we summarize the key steps of our algorithm in Algo¬ 
rithm [lj 


Algorithm 1 Evaluation of the nonlocal potential (1.1) 


Precomputation 

1. Gaussian-sum approximation of the kernel U(x) in (1.2). 

2. Fourier coefficients G^. in (2.17) via its tensor product composing vectors. 

Actual computation 


1. Compute pk, cf. (2.12). 

2. Evaluate I\ by (2.161 with FFT. 


3. Compute the Laplacian of p with FFT. 

4. Evaluate / 2 ,i by (2.30). 

5. Add I\ and / 2 i to obtain the approximation of u. 


2.5. Anisotropic densities. For clarity, we assume that the (rescaled) density 
is compactly supported in the rectangular box Bi > 7) := [—1, l] d—1 x ry[—1,1]. Here, 


the regular integral and correction integral in Sec. 2/2 and |2.3| needs modifications 
accordingly. More precisely, for regular integral evaluation, cf. Sec. |2.2[ the related 
changes are listed as follows: 


J i( x ) = \^2 w q G l 

k \q=0 / 


2,r ‘ kd (xca.i) 
e e( b d~ a d)^ X £ad) 


^_ } 2t xi kj / n 

n e^f (x - ai) , (2.33) 

j =i 


where 


Gl = 


C2V 21 (d) 12 - 2 ^ ik d y {d ) d 1 r2 


e — T q\y W \ 2 e dy W J] / e ~Tq\y U) \ 2 

'-IV ,=!■ ~ 


„(j) 


'dy u \ (2.34) 


'-2 


For the correction integral J 2 , one has to choose <5 smaller than rj so as to guarantee 
the validity and accuracy in the Taylor expansion. Numerical results for the 2D/3D 
Coulomb potentials are displayed in Section 0 cf. Tab. |4.5| and Tab. |4.3| 

Remark 2.4. Given a general rectangular domain, e.g., the 2D [—L X1 L X ] x 
[—Ly,L y ], the above algorithm adapted for anisotropic densities can then be adapted 
simply by setting rj = mm{L x /L y , L y /L x }. The 3D Coulomb potential evaluated on a 
more general rectangular domain can also be adapted similarly. 


3. Kernel approximation. The kernel’s high-resolution approximation is of 
great importance in our algorithm. Here we choose the effective GS approximation, 
which has already been exploited extensively in 9. Ill 13.33. Its tensor product 
structure leads to a considerable simplification of the pre-computation (2.17) in the 
regular integral I\ evaluation. The higher resolution in the GS approximation achieved 
with sine quadrature allows us to neglect the integral / 2j2 , cf. (2.23), thus, confines 
the near-field correction integral into a small ball Bs- 

The sine quadrature approach to obtain the GS approximation relies on a Gaus¬ 
sian integral representation of the kernel U. In this section, we briefly review some 
facts of the sine-quadrature 25 33 to make our paper reasonably self-contained, 


then present the concrete approximations of the kernels 1 jr and lnr on an interval 

[d, 2], 0 < 6 <C 1. 
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3.1. Sine quadrature. The sine function sine(f) := is an analytic func¬ 

tion, which equals to 1 at t = 0 and zero at t £ Z\ {0}. Sufficiently fast decaying con¬ 
tinuous functions / £ C(R) can be interpolated at the grid points tk = kft £ t®, ft > 0 
(step size) by functions Sk,p{t) := sine (t/ft — k), i.e., 

/4*)=^/(^)5 M (t). (3.1) 

Since J R sinc(t) dt = 1, an interpolatory quadrature for f R f(t) dt is given as follows: 

[ f(t)dt^ftJ2f(kft), (3.2) 

•' R fee z 

which can be viewed as “infinite trapezoidal rule” quadrature. Finite truncation to the 
first 2 S + 1 terms, i.e., k = —S ,..., S, of the infinite sum leads to the sine quadrature 
rule with the error ft'}Z\ k \ > s f(kft) depending on the decay-rate of /. For functions 
f(z ) in the Hardy space H 1 (D\), X < 7 t/2 , that is to say, f(z) is holomorphic in the 
strip D\ := {z £ C : |3z| < A} and 

N(f,D\) := f \f(z)\ \dz\ = f (\f{t + iX)\ + \f{t-i\)\)dt <oo, (3.3) 
JdD x JR 

and if f(z) also satisfies the double exponential decay property on the real axis, 
we have the following exponential error estimate for sine quadrature approximation, 
see [25] (Proposition 2.1). 


Proposition 1 ( 25 ). Let f £ H 1 (D\) with X < 7r/2. If f satisfies the double 
exponential decay condition, i.e., 

\f(t)\ < C exp(— 6 e“^) VfsM with a,b,C > 0, (3.4) 

then the quadrature error for the special choice ft = hi( 2n f s )/(aS) satisfies 


f f(t)dt-ft /( fcl? ) 

Jr 


|fe|<5 


< C N(f,D\) exp ( 


—2irXaS 
Vln(27r aS/b) 


(3.5) 


Remark 3.1. In the case of an integral expression f R g(t) e xh ^ dt, the constants 
in (3.5) depend on the parameter x. For some fixed x, an accuracy of e > 0 can be 


achieved with S := 0(|lne| ■ ln|lne|). Moreover, in our computations, we use the 


simplified step-size ft = Cq\u.{S)/S, cf. (3.5), as in f25\j with some positive constant 


Co (i.e., Co = 2.1 for the Coulomb kernel and cq = 1 for the Poisson kernel). 

In the following, we first represent the kernels l/|x| and ln|x| in Gaussian inte¬ 
gral form and then apply the sine-quadrature to obtain a GS approximation. These 


approximations are valid in an interval [<5, 2 ] and used to split the convolution ( 1 . 1 ) 
into a regular integral and short-range correction integral, cf. (2.5). 


3.2. Approximation of the Coulomb kernel 1/r over [5,2]. Starting from 
the following identity 


T a e~ pT dT = T{^±l)p~ 


q+1 


p > 0, a > —1, 


(3.6) 
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ln(r) 



Fig. 3.1. Number of terms S versus E Ie i for the kernel 1/r (left), E^ s forhir (right) on [5,2]. 


for a = 0 and p = |x| 2 , we have a Gaussian integral representation 

o /*oO O coo ^ 


1 

R 


/ 7T 


r 0 


e-N 2 - 2 

Rr, 


n 


- W V dr. 


(3.7) 


0 p=i 


Applying some numerical quadrature to the integral / 0 °° e pT dr leads to a GS approx¬ 
imation 


]-^5> 9 neR 


(3.8) 


3 = 1 


Remark 3.2. For kernels l/|x|^, f3 > 0, choosing a = /? — 1, formula ( 3.6 ) 


gives a Gaussian integral representation similar to that of the Coulomb kernel. Sub¬ 
stituting p = |x| 2 and applying the numerical quadrature, we obtain a similar GS 
approximation. 


The numerical quadrature we choose here is the sine quadrature , cf. Sec. 3.1 


which is suited for integrals f R f(t) dt with / £ C'(K) decaying sufficiently fast. More 
precisely, by a change of variables in ( |3.7| ), i.e., r = sinhf := ^(e f — e -t ), the updated 
integrand, now a function in Hardy space with double exponential decay on the real 
axis, satisfies the condition of Proposition [Tj A sine quadrature applies readily and 
the quadrature converges exponentially with respect to the number of Gaussian terms. 
The integrand is an even function, and we shall end up with only S + 1 terms. 

A detailed analysis (similar to that in 17|) shows that the quadrature for 1/r, r = 
|x| is acceptable for an interval r £ [<5, 2], 0 < <5 <§£ 1. The left picture of Fig. 3.1 shows 
the relative error E re \ of the GS approximation, where E le \ := || 1—X)f=o w q r e ~ Tqr ||l« 
from where one could observe the high-resolution approximation. 

3.3. Approximation of the Poisson kernel lnr over [5,2]. In the sub¬ 
section, we shall present a GS approximation for the 2D Poisson kernel In |x| := 

In vRR+RR. Setting a = 1 in (3.6), we have 


xG ) 2 + xG ) 2 


-(xW' 


+x 


( 2)2 


dr. 


(3.9) 


Applying a change of variables r = In (1 exp(sinht)), the integration domain in 


(3.9) is now the whole real axis and the integrand has double exponential decay. 


((5,2]), 
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Thus, we can apply the sine-quadrature (2 S + 1 terms in this case) to obtain a GS 
approximation of |x |~ 2 in [1,1?], R > 1, from which we can change to the interval 
[5,2] following a scaling argument. Inserting the GS approximation of l/|x | 2 into the 
following formula 

r - /-ad 1 ) 

In t/a;! 1 ) 2 + a;( 2 ) 2 = / - k dv. (3.101 

V VIGxC^ + zG ) 2 V ' 

we obtain an GS approximation as follows: 


V 


In \/z (1) +z (2) -C- 


- y ^w„e 

<7=1 


-T q (: 


.(!)' 


+X 


( 2 )‘ 


] =:J2 w i e 

<7=0 
1/2 




(3.11) 


where wq = Co, w q = —w q , q > 1 and To = 0, T q = T q , q > 1. We point out that 
the coefficients w q and r q in (3.11) should be computed stably (double precision) for 
both small and large nodes r q . The right figure in Fig. 3.1 shows the absolute error 

Kabs := II lnr — w q e _T ’ r2 ((< 5 , 2 ]) for the kernel lnr on [<5,2], 

4. Numerical results. In order to demonstrate the accuracy and efficiency 
performance of our method, we perform several numerical tests in this section. All 
the numerical errors are calculated in the relative maximum norm, which is defined 
as follows 


_ \\u-uz\h- _ max xgTh |u(x)-m k (x)| 

IMIi- niax xer , |u(x)| ’ 1 • J 

where 7h is the rectangular computational domain discretized uniformly in each di¬ 
rection with mesh sizes h = ( h x , h y ) T and (h x , h y , h z ) T for 2D and 3D, respectively. 
The grid function u ^ is the numerical solution and u is the exact solution. Further, 
we denote the total number of grid points by N := n x n y n z and N := n x n y for the 3D 
and 2D domain, respectively. For the sake of convenience, the mesh size in the p -th 
direction h p is simply denoted by h hereafter. 


The algorithm is implemented in Fortran, the code is compiled by ifort (version 
14.0.2) using the option -03, and executed on 64-bit Linux on a 2.53GHz Intel(R) 
Xeon(R) E5540 CPU with 6 MB cache. The CPU times shown in this section do 
not include all the pre-computation times. The exclusion of all the pre-computation 
steps is justified by the fact that in many applications/real simulations, one needs to 
evaluate the nonlocal potential multiple times on the same grid. 


Example 1. The 3D Coulomb potential. For the density p(x) := e +v +1 z ^ 
with cr > 0 and 7 > 1, the 3D Coulomb potential, with the kernel U(x) = 4 ^ x | , can 
be computed analytically as 

f ^fErf(M), 7=!, 

x G M 3 , (4.2) 


where Erf(a;) = Je ~ 4 dt for x G K is the error function. For densities p Xo (x) := 
p(x — xo) with xo G R 3 , the corresponding 3D Coulomb potential is given exactly as 
u Xo (x) = u(x - x 0 ). 


- 


z* 

er 2 (i+l) e a 2 (t+7 —2 ) 


o± r<* _ 

4 7 JO (i+1 


dt, 7 7 ^ 1 
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The 3D Coulomb potential is computed on [— L,L] 2 x [— L/ r y,L/'y] with mesh 


size h x = h y ,h 


= hx/'y. Table 


4.1 


shows the error E and computation time for 
the isotropic density, i.e., 7 = 1 , with cr = 1.2 on different domains [—L, L] 3 , where 
Ti,T 2 and T to tai denote hereafter the time for the evaluation of I\,I 2 in (2.5) and 


the total time, respectively. Table |4.2| presents the results of the potential fo r sh ifted 
density with a = 1.2 and x 0 = (1,2,1) T computed on [—12,12] 3 . Table 4.3 lists 
the errors E and timings for different anisotropic densities with a = 2 computed 
on [— 12 , 12] 2 x [— 12 / 7 , 12 / 7 ] using the same mesh size in x and y-direction, i.e., 
h x = h y = 1/4 and a different mesh size in z-direction, i.e., h z = hx/'y. 

From Tab. 4.1||4.3 we can conclude that: (i) The method is spectrally accurate 
with respect to the mesh size h and efficient with a complexity of 0(N In N); (ii) The 
anisotropic potential can be computed with spectral accuracy without increasing the 
memory or CPU time as the 7 tends larger, thus, it is ideal for applications. 


Table 4.1 

Errors and timings of the 3D Coulomb potential in Example [7] with isotropic density with 
a = 1.2 on [-L, L] 3 . 


L = 8 

N 

E 

T\ 

t 2 

Ttotai 

h= 1 

16 3 

1.096E-03 

9.99E-04 

1.00E-03 

2.00E-03 

h= 1/2 

32 3 

1.130E-09 

1.60E-02 

2.00E-03 

1.80E-02 

*.=1/4 

64 3 

6.169E-16 

1.93E-01 

1.90E-02 

2.12E-01 

*=1/8 

128 3 

6.187E-16 

1.69 

6.28E-01 

2.31 

*=1/16 

256 3 

7.725E-16 

15.03 

4.71 

19.74 

L = 16 

N 

E 

T\ 

t 2 

Ttotai 

h= 1 

32 3 

1.113E-03 

1.60E-02 

2.00E-03 

1 . 8 OE -02 

*=1/2 

64 3 

1.191E-09 

1.95E-01 

2.10E-02 

2 . 16 E -01 

*=1/4 

128 3 

9.259E-16 

1.71 

6.22E-01 

2.33 

*=1/8 

256 3 

9.271E-16 

15.18 

4.76 

19.94 


Table 4.2 

Errors and timings of the 3D Coulomb potential in Example^for shifted Gaussian density with 
0 = 1.2 and x 0 = (1,2,1) T on [—12,12] 3 . 


L= 12 

N 

E 

T\ 

t 2 

Ttotai 

h= 1 

24 s 

1.108E-03 

7.00E-03 

4.00E-03 

1.10E-02 

*=1/2 

48 3 

1.175E-09 

8.10E-02 

1.20E-02 

9.30E-02 

*=1/4 

96 3 

6.182E-16 

7.03E-01 

1.08E-01 

8.11E-01 

*=1/8 

192 3 

7.717E-16 

6.30 

1.08 

7.37 


Table 4.3 

Errors and timings of the 3D Coulomb potential in Example^ for anisotropic densities with 
a = 2 computed on [—12,12] 2 X / [—12,12] with h x = h y = 1/4, h z = h x / 7 (N = 96 3 ). 


7 

E 

11 V* 11 max 

Pi 

t 2 

Ttotai 

1 

4.486E-16 

2 

6.76E-01 

1.01E-01 

7.77E-01 

2 

5.599E-16 

1.209 

6.83E-01 

1.02E-01 

7.85E-01 

4 

1.427E-15 

0.681 

6.81E-01 

1.00E-01 

7.81E-01 

8 

2.606E-14 

0.364 

6.78E-01 

1.03E-01 

7.81E-01 
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Example 2. The 2D Coulomb potential. For the density p(x) = e ( x 2 + 7 2 y 2 )/<r 2 
with (j > 0 and 7 > 1, the 2D Coulomb potential, with the kernel f/(x) = 27 r| x | ', can 
be obtained analytically as 


i(x) = 


2 


~ A 0 \ 2 ^) 


a r c 
’yyfn JO 


+ 1) e cr^(t^+y-^) 


/P+1 y/t 2 +~/~ 2 


7 = 1 ) 


dt, 7 7 ^ 1 , 


x G . 


(4.3) 


where Io is the modified Bessel function of the first kind 1 . 

Similarly, we shall first present the accuracy and efficiency performance of our 
method on fixed domains [— L,L] 2 with cr = 1.2 in Table |4~4| Then we compute the 
2D Coulomb potential for anisotropic densities on [-L, L] x [-L/ 7, Lj 7] using a fixed 


mesh size in x-direction, i.e., h x = 1/4 and h y = h x j 7 , cf. Table 4.5 From Tab 


4.4 


and Tab |4.5[ we can draw similar conclusion as that in the 3D Coulomb example and 
we omit it for brevity. 


Table 4.4 

Errors and timings of the 2D Coulomb potential in Example [|/or a 
different mesh size. 


1.2 on [—L,L] 2 with 


L = 8 

N 

E 

T\ 

t 2 

Ttotal 

h= 1 

16 2 

9.426E-04 

0 

0 

0 

h= 1/2 

32 2 

1.720E-09 

0 

0 

0 

h= 1/4 

64 2 

4.190E-16 

2.00E-03 

1.01E-03 

3.00E-03 

k=l/B 

128 2 

5.229E-16 

6.00E-03 

2.00E-03 

8.00E-03 

h= 1/16 

256 2 

5.229E-16 

2.30E-02 

7.01E-03 

3.00E-02 

I, = 16 

N 

E 

Ti 

t 2 

'T'total 

h = 1 

32 2 

9.576E-04 

1.00E-03 

0 

1.00E-03 

h=l/2 

64 2 

1.815E-09 

1.00E-03 

0 

1.00E-03 

h= 1/4 

128 2 

5.846E-15 

5.00E-03 

2.00E-03 

7.00E-03 

k=l/S 

256 2 

5.846E-15 

2.60E-02 

7.00E-03 

3.30E-02 

h= 1/16 

512 2 

6.055E-15 

2.47E-01 

2.80E-02 

2.75E-01 


Table 4.5 

Errors and timings of the 2D Coulomb potential in Example []| for anisotropic densities with 
a = 2 computed on [—12,12] x ^[—12,12] with h x = 1/8 ,h y = h x /7 and N = 192 2 . 


7 

E 

11 'tt 11 max 

T\ 

t 2 

Ttotal 

1 

5.047E-16 

1.773 

1.00E-02 

2.00E-03 

1.20E-02 

2 

5.479E-16 

1.217 

1.20E-02 

3.00E-03 

1.50E-02 

4 

4.235E-16 

7.902E-01 

9.00E-03 

2.00E-03 

1.10E-02 

8 

1.402E-15 

4.902E-01 

1.20E-02 

2.00E-03 

1.40E-02 

16 

8.387E-15 

2.935E-01 

1.20E-02 

2.00E-03 

1.40E-02 


Example 3. The 2D Poisson potential. For p(x) := e ^ ! a = e r with 
r = |x| and a > 0, the 2D Poisson potential, with the kernel C/(x) = — ^ In |x|, can 
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be obtained analytically as 


m(x) = 


E 1 (^)+21n(|x|) 


X - ln((J 2 )) , 


x / 0, 
x = 0, 


(4.4) 


where Ei(r) := t~ 1 e~ t dt for r > 0 is the exponential integral function 1 and 
7 e ~ 0.5772156649015328606 is the Euler-Mascheroni constant. 

The 2D Poisson potential is computed on \—L, L] 2 with mesh size h x = h y . Table 
shows the error E and computation time with a = 1.2 on [—L,L] 2 with different 


4.6 


mesh sizes. Spectral accuracy and 0(N In N) efficiency can be observed from Tab. 4.6 


Table 4.6 

Errors and timings of the 2D Poisson potential in Example^with a = 1.2 on \—L,L] 2 . 


L = 8 

N 

E 

T] 

t 2 

Ttotal 

h= 1 

16 2 

3.768E-04 

0 

0 

0 

h= 1/2 

32 2 

3.331E-10 

1.00E-03 

0 

1.00E-03 

h= 1/4 

64 2 

3.623E-15 

2.00E-03 

1.00E-03 

3.00E-03 

h= 1/8 

128 2 

2.988E-15 

6.00E-03 

1.00E-03 

7.00E-03 

*.=1/16 

256 2 

5.085E-15 

2.30E-02 

4.00E-03 

2.70E-02 

L = 16 

N 

E 

T\ 

t 2 

Ttotal 

h= 1 

w 

2.966E-04 

1.00E-04 

0 

1.00E-03 

*=1/2 

64 2 

2.713E-10 

2.00E-03 

0 

2.00E-03 

*=1/4 

128 2 

3.856E-15 

6.00E-03 

2.00E-03 

8.00E-03 

*=1/8 

256 2 

3.164E-15 

2.60E-02 

6.00E-03 

3.20E-02 

*=1/16 

512 2 

6.921E-15 

2.47E-01 

3.00E-02 

2.77E-01 


Example 4. The dipolar potential in 3D. The 3D dipolar potential is defined 


by convolution as follows [3 4 26 


u(x) = — (n • m) p(x) — 3 d n 
= — (n • m) p(x) — 3 


* p 


1 


47t|:? 


47t|x 

* (<9 nm p) 


(4.5) 


where n, m are two given unit vectors in M 3 . Note that the dipolar potential can 
actually be solved via the Coulomb potential by (4.51 with the new source term 
(<9 nm p) ■ Numerically, the source term ( d nm p ) can be easily obtained by differentiating 
the Fourier pseudospectral approximation of the fast decaying density. 

Similarly, we consider a radial symmetric density p(x) = , and the po¬ 

tential is given explicitly as 


u(x) = — (n • m) p(x) — 3 d n 
= -(n • m) p(x) -3d n 


* p 


47t|x 
cr 2 V / 7r Erf(r/<r) 


r/ C 


= — (n • m) p(x) — 3 n 1 D m, 


(4.6) 
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where Sij is the Dirac delta function and the Hessian matrix D is given as follows 


Djj — 5tj ( 2 ^ 2 e 


er 3 -/^ 

4r 3 


XiXj - 


3 (J 2 
2 r 4 


Erf 0J + 

_4 , 3a 3 ^7r 
4 r 5 


Erf ©)' 


i,j — 1,2,3. 


Table 4.7 shows the error and timings of the 3D dipolar potential evaluation with 


<t = 1.2 and two randomly chosen vectors n = (0.82778, 0.41505, —0.37751) 4 , m = 
(0.3118,0.9378,—0.152144 on [—8, 8] 3 . Here, T pre is the CPU time for computing 
the source term d nm p , T), T 2 and Ttotai are the same as those defined previously. We 
observe spectral accuracy and the timings show the expected scaling 0(N In N). 


Table 4.7 

Errors and timings of the 3D dipolar potential in Example 01 with cr = 1.2, n = 

(0.82778, 0.41505,-0.37751) T ,m= (0.3118, 0.9378,-0.15214) T on [-8,8]b 


L = 8 

N 

E 

Tpre 

T 

t 2 

Ttotai 

h= 1 

w 

1.380E-02 

0 

2.00E-03 

0 

2.00E-03 

h= 1/2 

32 3 

2.647E-07 

2.00E-03 

1.50E-02 

2.00E-03 

1.90E-02 

*.=1/4 

64 3 

1.430E-14 

1.70E-02 

2.00E-01 

1.90E-02 

2.35E-01 

*=1/8 

128 3 

4.076E-14 

1.96E-01 

1.68 

2.20E-01 

2.10 


Example 5. The Davey-Stewartson (DS) nonlocal potential. In the DS equa¬ 
tion, the nonlocal potential can be given by a convolution as follows: 

w(x) = In |x| * ( d xx p ), x £ R 2 . (4.7) 

Z7T 

For a Gaussian density p(x, y) = n e~ n ( ' x +v \ the DS nonlocal potential is given 
explicitly, in polar coordinates, as 

$(r, 9) = - e~ n2r2 + cos(20) e~^ r2 (2 7 rr 2 )- 1 (l + ttV - e^’’ 2 )) . (4.8) 

Table |4.8| displays the error and computational time of the DS nonlocal potential on 
[—8, 8] 2 . We observe the scaling 0(N In TV) and rapid decrease of the error as the mesh 
size gets smaller, although the best reached precision is below those of the previous 
2D Poisson/Coulomb examples. In this context, note that the parameter a is larger 
compared to the preceding test in Example [3j i.e., a finer resolution would be needed 
for the more localized density. 


Table 4.8 

Errors and timings of the DS nonlocal potential in Example [7] on [—8,8] 2 . 


L = 8 

N 

E 

T 

t 2 

Ttotai 

*=1/2 

32? 

1.474 

1.00E-03 

0 

1.00E-03 

*=1/4 

64 2 

5.720E-03 

2.00E-03 

0 

2.00E-03 

*=1/8 

128 2 

3.974E-09 

5.00E-03 

2.00E-04 

6.00E-03 

*=1/16 

256 2 

4.536E-13 

2.20E-02 

7.00E-03 

2.80E-02 
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5. Conclusions. Starting from the convolution definition, we presented an effi¬ 
cient and accurate algorithm for a class of nonlocal (long-range) potentials, where the 
densities are assumed to be smooth and fast decaying. The method use a Gaussian- 
sum approximation of the singular convolution kernel to split the convolution into 
two integrals, namely a long-range regular integral and a short-range singular inte¬ 
gral. Due to the high-resolution GS approximation obtained with sine quadrature, 
the regular integral was computed with FFT and the singular integral evaluation was 
done with a low-order Taylor expansion of the density. The algorithm achieves spec¬ 
tral accuracy and is essentially as efficient as FFT algorithms with a computational 
complexity at 0(NInN), where N is the total number of points in the discretization 
of physical space. The method was implemented in Fortran and verified for several 
different potentials, including the 2D/3D Coulomb potential, the 2D Poisson, the 3D 
dipolar potential and the Davey-Stewartson nonlocal potential. 


The algorithm is suitable for parallel computation, e.g., MPI or GPU parallel 
computation and the development of parallel version is on-going. We shall mention 
here that the algorithm could possibly be adapted to the magnetostatic potential in 
stray field computations for micromagnetic simulations 
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